Continuous State Dynamic Programming via Nonexpansive Approximation
Similar resources
Continuous State Dynamic Programming via Nonexpansive Approximation
This paper studies fitted value iteration for continuous state numerical dynamic programming using nonexpansive function approximators. A number of approximation schemes are discussed. The main contribution is to provide error bounds for approximate optimal policies generated by the value iteration algorithm.
Journal of Economic Literature Classifications: C61, C63
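As a rough illustration of the idea in this abstract, the sketch below runs fitted value iteration with a piecewise-linear interpolant as the function approximator. It is not taken from the paper: the growth model, the parameters `alpha` and `beta`, the grid, and every function name are assumptions chosen for illustration. Linear interpolation is used because it is nonexpansive in the sup norm, so composing it with the Bellman operator keeps the contraction property on the node values.

```python
import numpy as np

# Hypothetical model (an assumption, not from the paper): deterministic
# optimal growth with log utility, production k**alpha, discount factor beta.
alpha, beta = 0.4, 0.9
grid = np.linspace(1e-3, 2.0, 120)  # interpolation nodes for the state y


def approx(values):
    """Return the piecewise-linear interpolant through (grid, values).

    Shifting the node values by at most eps shifts the interpolant by at
    most eps everywhere, so this approximator is nonexpansive (sup norm).
    """
    return lambda x: np.interp(x, grid, values)


def bellman(values, n_actions=200):
    """Apply the Bellman operator to the interpolant; evaluate at the nodes."""
    w = approx(values)
    new_values = np.empty_like(values)
    for i, y in enumerate(grid):
        # Candidate savings levels k in (0, y); consumption is y - k.
        k = np.linspace(1e-6, y - 1e-6, n_actions)
        new_values[i] = np.max(np.log(y - k) + beta * w(k ** alpha))
    return new_values


def fitted_value_iteration(tol=1e-6, max_iter=1000):
    """Iterate until successive node values differ by less than tol."""
    values = np.log(grid)  # arbitrary initial guess
    err = np.inf
    for it in range(1, max_iter + 1):
        new_values = bellman(values)
        err = np.max(np.abs(new_values - values))
        values = new_values
        if err < tol:
            break
    return values, err, it
```

Calling `fitted_value_iteration()` returns the node values of the approximate value function, the last sup-norm change, and the number of iterations used. Replacing `approx` with an approximator that is not nonexpansive (for example, an unconstrained polynomial fit) can break the contraction argument, which is the reason for the design choice here.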
Hybrid Methods for Continuous State Dynamic Programming
We propose a method for solving continuous-state and action stochastic dynamic programs that is a hybrid between the continuous space projection methods introduced by Judd and the discrete space methods introduced by Bellman. Our hybrid approach yields a smooth representation of the value function while preserving the computational simplicity of discrete dynamic programming. The method is espec...
Approximation of fixed points for a continuous representation of nonexpansive mappings in Hilbert spaces
This paper introduces an implicit scheme for a continuous representation of nonexpansive mappings on a closed convex subset of a Hilbert space with respect to a sequence of invariant means defined on an appropriate space of bounded, continuous real valued functions of the semigroup. The main result is to prove the strong convergence of the proposed implicit scheme to the unique solutio...
Symbolic Dynamic Programming for Continuous State and Observation POMDPs
Point-based value iteration (PBVI) methods have proven extremely effective for finding (approximately) optimal dynamic programming solutions to partially observable Markov decision processes (POMDPs) when a set of initial belief states is known. However, no PBVI work has provided exact point-based backups for both continuous state and observation spaces, which we tackle in this paper. Our key in...
Symbolic Dynamic Programming for Continuous State and Action MDPs
Qa := ∫ Qa ⊗ P(xj | b, b′, x, a, y) dxj   [Symbolic Substitution]
For all bi in Qa:
    Qa := [Qa ⊗ P(bi | b, x, a, y)]|b′i=1 ⊕ [Qa ⊗ P(bi | b, x, a, y)]|b′i=0   [Case ⊕]
Compute final Q-Value (discount and add reward):
    Qa := R(b, x, a, y) ⊕ (γ ⊗ Qa)
Note that ∫ f(xj) ⊗ δ[xj − h(z)] dxj = f(xj){xj/h(z)}, where the latter operation indicates that any occurrence of xj in f(x′j) is symbolically substituted with the...
Journal
Journal title: Computational Economics
Year: 2007
ISSN: 0927-7099,1572-9974
DOI: 10.1007/s10614-007-9111-5